Abstract: In this paper, we present a coding-theoretic framework
for message transmission over packet-switched networks that
employ routing in network nodes. The network is modeled as a
channel which can induce packet erasures, errors, insertions, and
out-of-order delivery of packets. The framework closely follows
the one presented by Kötter and Kschischang for networks based
on random linear network coding.

Index Terms: Subset codes, packet networks, routing, permutation
channel, packet erasure codes, forward error correction.



I. Introduction 

PACKET-SWITCHED networks which employ routing as
a means for transmitting packets between pairs of users
are in widespread use in communications today [1]. We
formulate here a framework for end-to-end forward error
correction (FEC) in such networks. We are motivated by the
work of Kötter and Kschischang [2], in which the authors
define so-called subspace codes and show that these codes, and
particularly their constant-dimension versions, are adequate
constructions for error and erasure recovery in networks
employing random linear network coding. We will follow closely
their exposition in Section III since the scenario considered
here yields a very similar mathematical model.

II. Packet networks 

Consider a pair of users in a given packet-switched network, 
one user being the source and the other the destination of the 
message (we consider only unicast here, but the generalization 
to multicast is straightforward). The message sent by the 
source consists of $\ell$ packets. Due to varying topology and load,
these packets can be sent over different routes in the network 
and, as a consequence, they are received in practically arbitrary 
order. This is especially true for, e.g., mobile ad-hoc networks 
where the topology is rapidly changing, and heavily loaded 
datagram-based networks in which the packets are frequently 
redirected in order to balance the load over different parts 
of the network. Hence, we will model networks as packet 
permutation channels which can deliver injected packets in 
arbitrary order at the destination. Apart from permutations, 
there are various other unwanted effects the network can 
impose on injected packets. We consider here three of them: 
erasures (deletions), errors, and insertions. Packet erasures
can occur for many reasons: finite buffering capabilities of
routers, router/link failures, etc. Errors are caused by noise,
malfunctioning of network equipment, etc. Finally, packet
insertions are a form of malicious behaviour, where some user
imitates the true source of the data and wants the receiver to
misinterpret the data. To sum up, in this paper, networks are
modeled as permutation channels with erasures, errors, and 
insertions. 

III. Codes for packet networks 

Let us start with a somewhat general approach. Consider a
communication channel acting on an input by some randomized
transformation (not including errors, erasures, insertions,
deletions, etc.). In the case of routed networks, this
transformation corresponds to permuting the packets in an
unpredictable and essentially random way. In the case of random
linear network coding (RLNC), the transformation represents random
linear combining of the packets, and so on. The idea of sending
information through such channels is very simple: encode the
information in an object which is invariant under the given
transformation. This has led Kötter and Kschischang [2] to the
abstraction of the channel corresponding to RLNC networks
(the operator channel) and the definition of codes for such a
channel. In this case, the object which is invariant to random
linear combining of packets is the vector space spanned by
those packets.¹

In the case under consideration here, namely routed packet 
networks, we need an object which is invariant under random 
permutations of the packets. Such an object is a set. Therefore, 
a natural idea is to consider sets as possible "codewords" 
for this channel. As in any code, the codewords need to be 
sufficiently far apart so as to enable the receiver to recover 
from erasures and errors. In the following subsection we will 
define these codes, named subset codes, in a precise (and 
somewhat abstract) way. Section III-B explains subset codes
through more concrete examples. 

A. Power sets and subset codes 

Let $S$ be a finite set, and let $\mathcal{P}(S)$ denote the power
set of $S$, i.e., the set of all subsets of $S$. A natural metric
associated with this space is defined by:

    $$ d(X, Y) = |X \triangle Y| \tag{1} $$

for $X, Y \in \mathcal{P}(S)$, where $\triangle$ denotes the symmetric
difference between sets. It can also be written as
$d(X, Y) = |X \cup Y| - |X \cap Y| = |X| + |Y| - 2|X \cap Y|
= 2|X \cup Y| - |X| - |Y|$. This distance is the length of the
shortest path between $X$ and $Y$ in the Hasse diagram [3] of the
lattice of subsets of $S$ ordered by inclusion. It is analogous to
the subspace metric defined in [2]. This diagram plays a role
similar to the Hamming hypercube for the classical codes in the
Hamming metric (actually, it is isomorphic to the Hamming
hypercube, see Section IV). Another useful metric is given by:

    $$ d'(X, Y) = \max\{|X \setminus Y|, |Y \setminus X|\}. \tag{2} $$

It can also be written as $d'(X, Y) = \max\{|X|, |Y|\} - |X \cap Y|
= |X \cup Y| - \min\{|X|, |Y|\}$, and it is analogous to
the injection metric for subspace codes [4]. Direct proofs
that $d$ and $d'$ are indeed metrics are easy and very similar
to the proofs for subspace and injection metrics, and we
shall therefore omit them. In the following, we will only use
distance $d$ and refer to it as the subset metric.

¹ Strictly speaking, the vector space is invariant only with high
probability, i.e., if the transformation is full-rank.
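The identities stated for the two metrics can be checked numerically. The following is a minimal sketch using Python's built-in `frozenset`; the function names and the sample sets are illustrative choices, not part of the framework itself.

```python
def subset_distance(x, y):
    """Subset metric d(X, Y): size of the symmetric difference, Eq. (1)."""
    return len(x ^ y)


def max_difference_distance(x, y):
    """Metric d'(X, Y) = max(|X minus Y|, |Y minus X|), Eq. (2)."""
    return max(len(x - y), len(y - x))


# Two illustrative subsets of some ambient set S.
X = frozenset({"a", "b", "c"})
Y = frozenset({"b", "c", "d", "e"})

# Alternative expressions for d given in the text.
assert subset_distance(X, Y) == len(X | Y) - len(X & Y)
assert subset_distance(X, Y) == len(X) + len(Y) - 2 * len(X & Y)
assert subset_distance(X, Y) == 2 * len(X | Y) - len(X) - len(Y)

# Alternative expressions for d' given in the text.
assert max_difference_distance(X, Y) == max(len(X), len(Y)) - len(X & Y)
assert max_difference_distance(X, Y) == len(X | Y) - min(len(X), len(Y))
```

Representing codewords as `frozenset` objects makes the permutation invariance explicit: the order in which elements are supplied has no effect on either distance.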

One can define codes in the space $\mathcal{P}(S)$ in the usual way.
Namely, a subset code $C$ is simply a nonempty subset of
$\mathcal{P}(S)$. Important parameters of such a code are its
cardinality, $|C|$, minimum distance:

    $$ \min_{X, Y \in C,\ X \neq Y} d(X, Y), \tag{3} $$

maximum cardinality of the codewords:

    $$ \max_{X \in C} |X|, \tag{4} $$

and the cardinality of the ambient set, $|S|$. If
$C \subseteq \mathcal{P}(S)$ has minimum distance $d$, and every
codeword is of cardinality at most $\ell$, we say that it is a code
of type $[\log|S|, \log|C|, d; \ell]$ (the
base of the logarithm is generally arbitrary; we will assume
that it is 2, and hence that the lengths of the messages are
measured in bits). If all codewords of $C$ are of cardinality
$\ell$, we say that it is a constant cardinality code. A significant
advantage of constant cardinality codes is that the receiver
knows in advance how many packets it needs to receive in
order to initiate decoding, similarly to the constant dimension
codes in projective spaces [2]. The rate of an $[n, k, d; \ell]$ code
is defined by:

    $$ R = \frac{k}{n \ell}. \tag{5} $$

In the intended application of subset codes, $S$ will be the
set of all possible packets, $n = \log|S|$ the length of each
packet, and $\ell$ the number of packets one codeword contains.
The source maps an information sequence of length $k$ bits to a
codeword, which is a set consisting of $\ell$ packets of length $n$
bits each, and sends these $\ell$ packets through a channel. In
the channel, these packets are permuted, some of them are
erased, some of them are received erroneously, and possibly
some new packets are inserted by a malicious user. The
receiver collects all these packets and attempts to reconstruct
the codeword which was sent, i.e., the information sequence
which corresponds to this codeword.
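To make the code parameters concrete, the sketch below computes the type of a small subset code directly from the definitions. The helper names `code_type` and `rate`, as well as the toy code itself, are hypothetical; the rate is taken to be $k/(n\ell)$, i.e., information bits per transmitted bit, which is an assumption about the intended normalization.

```python
from itertools import combinations
from math import log2


def code_type(code, ambient):
    """Return (n, k, d, ell) for a subset code over the ambient set.

    n   = log2 |S|          (packet length in bits)
    k   = log2 |C|          (information bits per codeword)
    d   = minimum subset distance between distinct codewords
    ell = maximum codeword cardinality
    """
    n = log2(len(ambient))
    k = log2(len(code))
    d = min(len(x ^ y) for x, y in combinations(code, 2))
    ell = max(len(x) for x in code)
    return n, k, d, ell


def rate(n, k, ell):
    """Assumed rate definition: information bits over transmitted bits."""
    return k / (n * ell)


# Toy example: 8 possible packets, 4 pairwise disjoint codewords of size 2.
S = frozenset(range(8))
C = [frozenset({0, 1}), frozenset({2, 3}),
     frozenset({4, 5}), frozenset({6, 7})]

n, k, d, ell = code_type(C, S)
```

In this toy code every codeword has the same cardinality, so it is a constant cardinality code and the receiver knows it should expect two packets.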

With the above terminology at hand, one can also define
the channel described in Section II in a more precise way.
Namely, we observe a discrete memoryless channel with input
and output alphabets equal to $\mathcal{P}(S)$. The channel is
completely described by its transition probabilities (the
probabilities of mapping the input subset $X$ to the output subset
$Y$, for all $X, Y \in \mathcal{P}(S)$) which, in turn, are determined by
the joint statistics of erasures, errors, and insertions of the
elements of $S$.



We next prove a simple but basic fact about the correcting
capabilities of subset codes.

Theorem 1: Assume that a code $C$ of type $[n, k, d; \ell]$ is used
for transmitting packets over a network. Then any pattern of
$p$ erasures, $t$ errors, and $s$ insertions can be corrected by the
minimum distance decoder (with respect to the subset metric),
as long as $2(p + 2t + s) < d$.

Proof: Let $X \in C$ be the set/codeword which is transmitted
through a channel. Let $Y$ be the received set. If $p$
packets from $X$ have been deleted, and $s$ new packets have
been inserted, then we easily deduce that $|X \cap Y| \geq |X| - p$
and $|Y| \leq |X| - p + s$. Observe further that errors can be
regarded as combinations of erasures and insertions. Namely,
an erroneous packet can be thought of as being inserted, while
the original packet has been erased. Therefore, the actual
number of erasures and insertions is (at most) $p + t$ and $s + t$,
respectively. We therefore conclude that $|X \cap Y| \geq |X| - p - t$
and $|Y| \leq |X| - p + s$, and so

    $$ d(X, Y) = |X| + |Y| - 2|X \cap Y| \leq p + 2t + s. \tag{6} $$

Now, if $2(p + 2t + s) < d$, then $d(X, Y) \leq \lfloor (d-1)/2 \rfloor$.
By the triangle inequality, every other codeword is then strictly
farther from $Y$ than $X$ is, and therefore
$X$ can be recovered from $Y$. ■
If only erasures are allowed in the channel, we will have
$d(X, Y) = p$ and a sufficient condition for unique decodability
will be $p \leq \lfloor (d-1)/2 \rfloor$.
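Theorem 1 can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions: the channel model `channel`, the brute-force decoder `min_distance_decode`, and the code of four disjoint 4-subsets (minimum distance 8) are all hypothetical choices made for illustration, and exhaustive search over the codebook is not meant as a practical decoding algorithm.

```python
import random
from itertools import product


def min_distance_decode(received, code):
    """Return the codeword closest to `received` in the subset metric."""
    return min(code, key=lambda x: len(x ^ received))


def channel(sent, ambient, p, t, s, rng):
    """Apply p erasures, t errors and s insertions to the set `sent`.

    An error is modeled as in the proof: the original packet is erased
    and a different packet is inserted. The output is a set, so any
    reordering of packets in the network is irrelevant.
    """
    packets = sorted(sent)          # packets are integers in this toy model
    rng.shuffle(packets)
    survivors = packets[p + t:]     # first p packets erased, next t corrupted
    outside = [pkt for pkt in ambient if pkt not in sent]
    new_packets = rng.sample(outside, t + s)
    return frozenset(survivors) | frozenset(new_packets)


ambient = range(16)
# Four pairwise disjoint codewords of cardinality 4: minimum distance d = 8.
code = [frozenset(range(i, i + 4)) for i in (0, 4, 8, 12)]
rng = random.Random(0)


def all_corrected(d=8):
    """Check every pattern with 2(p + 2t + s) < d on every codeword."""
    for p, t, s in product(range(4), repeat=3):
        if 2 * (p + 2 * t + s) < d:
            for sent in code:
                received = channel(sent, ambient, p, t, s, rng)
                if min_distance_decode(received, code) != sent:
                    return False
    return True


print("all admissible patterns corrected:", all_corrected())
```

In this model the received set satisfies $d(X, Y) = p + 2t + s$ exactly, so the bound in (6) is met with equality.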



B. Examples of subset codes 

In this section, we give a simple example of subset codes 
to illustrate the above definitions. 

How does one encode information in a set? One possible
solution (which is widely used in practice) is to add a sequence
number to every packet sent, thus achieving resilience to
arbitrary permutations. To illustrate this, assume that the
source has two packets to send, $p_0$ and $p_1$. Note that, from
the point of view of the receiver, the sequence $(p_0, p_1)$ is not
the same as the sequence $(p_1, p_0)$; these two sequences carry
different information. In the permutation channel, however,
either of these two sequences can be received when $(p_0, p_1)$
is sent. The sender therefore sends $(q_0, q_1)$ instead, where
$q_i = (i, p_i)$ is the new packet formed by prepending a
sequence number to the packet $p_i$. Note that sequences $(q_0, q_1)$
and $(q_1, q_0)$ are now identical to the receiver because in both
cases it will extract $(p_0, p_1)$ and further process these packets.
This means that the carrier of information is actually a set
$\{q_0, q_1\} = \{(0, p_0), (1, p_1)\}$. This approach, combined with
some classical packet-level error-correcting code, provides an
example of subset codes that we describe next.
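The sequence-number idea can be sketched in a few lines. The helper names `to_codeword` and `from_received` and the byte-string payloads are illustrative assumptions; real protocols carry the sequence number in a packet header rather than in a tuple.

```python
import random


def to_codeword(packets):
    """Form q_i = (i, p_i) and forget the order: the codeword is a set."""
    return frozenset(enumerate(packets))


def from_received(received):
    """Sort by sequence number and strip it to recover (p_0, p_1, ...)."""
    return [payload for _, payload in sorted(received)]


packets = [b"first payload", b"second payload"]
sent = to_codeword(packets)

# The permutation channel may deliver the packets in any order.
delivered = list(sent)
random.shuffle(delivered)

recovered = from_received(delivered)
assert recovered == packets

# Both orderings of (q_0, q_1) describe the same set, hence the same codeword.
q0, q1 = (0, packets[0]), (1, packets[1])
assert frozenset([q0, q1]) == frozenset([q1, q0]) == sent
```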

Let $A$ be the set of all packets the source can possibly send.
Assume that $|A| = 2^m$, so that we can think of information
packets as having $m$ bits. Assume further that the source
wishes to send $k$ such packets, $p_0, \ldots, p_{k-1}$, to a destination
over a network, i.e., over a permutation channel with erasures,
errors, and insertions. To protect the packets, the source defines
some packet-level block code (see, e.g., [5]), and uses the
corresponding encoder to map these $k$ packets to $\ell > k$
packets, $q_0, \ldots, q_{\ell-1}$. To cope with the permutations in the
channel, the source further adds a sequence number of length
$\log_2 \ell$ bits² to every packet $q_i$. This gives a subset code of
type $[m + \log_2 \ell, km, d; \ell]$, where $d$ is its minimum distance
whose concrete value is irrelevant for this example. In words,
the length of the packets is $m + \log_2 \ell$ bits, there are $2^{km}$
possible information sequences (and hence the same number
of codewords), and each codeword consists of $\ell$ packets. The
rate of the code is therefore $R = \frac{km}{\ell (m + \log_2 \ell)}$.

To further clarify the above arguments, assume that the
Reed-Solomon code is used as a packet-level block code in
the above scenario. Namely, the message to be sent ($k$ packets,
$p_0, \ldots, p_{k-1}$, of length $m$ bits each) is regarded as a
polynomial of degree at most $k - 1$ over $\mathbb{F}_{2^m}$:

    $$ u(z) = \sum_{i=0}^{k-1} p_i z^i. \tag{7} $$

The codeword represents the sequence of evaluations of
this polynomial at $\ell$ fixed distinct points in $\mathbb{F}_{2^m}$.
Denote these points by $\alpha_1, \ldots, \alpha_\ell$, so that the codeword is
$u(\alpha_1), \ldots, u(\alpha_\ell)$. The resulting code has minimum
(Hamming) distance $\ell - k + 1$ [6]. Now, the $u(\alpha_i)$'s are treated as
packets (these are the $q_i$'s from the previous paragraph), and
a sequence number $i$ (the index of the point of evaluation of the
message polynomial) is added to each packet. As already
explained, these sequence numbers enable the receiver to
recover from permutations, but also from erasures, errors, and
insertions because it can keep track of evaluation points.
Finally, the codeword corresponding to the information sequence
$(p_0, \ldots, p_{k-1})$ is a set $U = \{(i, u(\alpha_i)) : i = 1, \ldots, \ell\}$.
Since two distinct polynomials $u$ and $v$ of degree at most $k - 1$
can agree on at most $k - 1$ different points, we conclude that
$|U \cap V| \leq k - 1$ and therefore
$d(U, V) = 2\ell - 2|U \cap V| \geq 2(\ell - k + 1)$. Thus,
we have defined a constant cardinality subset code of type
$[m + \log_2 \ell, km, 2(\ell - k + 1); \ell]$, and rate:

    $$ R = \frac{km}{\ell (m + \log_2 \ell)}. \tag{8} $$

This example is analogous to the construction of
Kötter-Kschischang codes [2] for networks based on random linear
network coding.
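The Reed-Solomon construction can be verified exhaustively for very small parameters. The sketch below assumes $m = 3$, $k = 2$, $\ell = 5$, the irreducible polynomial $x^3 + x + 1$ for arithmetic in $\mathbb{F}_{2^3}$, and the evaluation points $1, \ldots, 5$; all of these are illustrative choices, and sequence numbers are 0-based here for convenience.

```python
from itertools import combinations, product

M = 3             # bits per information packet; the field is GF(2^3)
MODULUS = 0b1011  # x^3 + x + 1, an irreducible polynomial (assumed choice)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) represented as M-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^M) is XOR
        b >>= 1
        a <<= 1
        if a & (1 << M):         # reduce modulo the field polynomial
            a ^= MODULUS
    return result


def evaluate(message, point):
    """Evaluate u(z) = sum_i p_i z^i at `point` using Horner's rule."""
    acc = 0
    for coeff in reversed(message):
        acc = gf_mul(acc, point) ^ coeff
    return acc


def encode(message, points):
    """Map k information packets to the set {(i, u(alpha_i))}."""
    return frozenset((i, evaluate(message, a)) for i, a in enumerate(points))


K, ELL = 2, 5
POINTS = [1, 2, 3, 4, 5]  # ELL distinct field elements (assumed choice)

# All 2^(K*M) codewords, one per information sequence (p_0, ..., p_{K-1}).
codebook = [encode(msg, POINTS) for msg in product(range(2 ** M), repeat=K)]

# Minimum subset distance over all pairs of distinct codewords.
min_dist = min(len(u ^ v) for u, v in combinations(codebook, 2))
assert min_dist == 2 * (ELL - K + 1)
```

For these parameters the bound $2(\ell - k + 1)$ is met with equality, as expected from the fact that the underlying Reed-Solomon code is maximum distance separable.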

Even though RS codes are maximum distance separable
[6], the subset codes obtained in this way are not. Namely,
adding a sequence number is not an optimal way of encoding
information in a set (though this suboptimality is not a concern
in practice, because sequence numbers only take a couple
of bytes in the packet header). The other reason for
non-optimality is that these codes are constant cardinality codes;
larger codes can be obtained if one allows codewords of
different cardinality. This is analogous to the relation between
general subspace codes in projective spaces and constant dimension
codes [7]. In Section IV we discuss how one can construct
optimal (in any sense) subset codes.

IV. Another approach to subset codes 

Let $S$ be a nonempty finite set with some implied ordering
of its elements, and observe the space $\{0, 1\}^{|S|}$ of all binary
sequences of length $|S|$ (denoted also $2^S$). Each binary sequence
$x \in 2^S$ defines a subset $X \subseteq S$ containing elements defined
by the positions of ones in $x$. As is well-known, this mapping
of subsets to binary sequences is an isomorphism between
groups $(\mathcal{P}(S), \triangle)$ and $(2^S, \oplus)$, where $\oplus$ denotes
the XOR operation (addition modulo 2). Furthermore, it is easy to
show that the Hamming distance between two sequences $x, y \in 2^S$
is precisely the subset distance between the corresponding
subsets $X, Y \subseteq S$:

    $$ d_H(x, y) = w_H(x \oplus y) = |X \triangle Y| = d(X, Y), \tag{9} $$

where $w_H$ denotes the Hamming weight of the binary sequence.
In other words, this mapping is also an isometry
between metric spaces $(\mathcal{P}(S), d)$ and $(2^S, d_H)$. This means
that the subset codes in fact represent only another way to
look at classical codes in the binary Hamming space, and
vice versa. Constant cardinality codes are then analogous to
constant weight binary codes. Note that the classical binary
codes corresponding to $[n, k, d; \ell]$ subset codes have
parameters $(2^n, k, d)$.

² For notational simplicity we disregard the fact that the actual
length of the sequence number is $\lceil \log_2 \ell \rceil$.

The above reasoning, though quite elementary, has an 
important implication. It shows that classical binary codes 
developed for binary channels (such as the Binary Symmetric 
Channel) define in a very natural way codes for correcting 
erasures, errors, and insertions in networks. In other words, 
the study of subset codes and their properties reduces to the 
well-known theory of binary codes. 

The following simple example illustrates the above notions. 

Example 1: Let $S = \{a, b, c, d\}$. Any subset of $S$ can
be identified by a binary sequence of length 4; for example,
$\{a, b\} \leftrightarrow 1100$, $\{b, d\} \leftrightarrow 0101$, etc. Consider now
some code in $\{0, 1\}^4$, e.g., $C = \{1100, 1010, 0110, 0011\}$.
The subset counterpart of this code is then
$C_S = \{\{a, b\}, \{a, c\}, \{b, c\}, \{c, d\}\}$. The distance between two
subsets of $S$ is the Hamming distance between the corresponding
binary sequences, for example:

    $$ d(\{a, b\}, \{a, c\}) = |\{b, c\}| = 2 = d_H(1100, 1010), \tag{10} $$

so that all properties of $C$ directly translate into equivalent
properties of the subset code $C_S$. The code $C_S$ is a constant
cardinality code of type $[2, 2, 2; 2]$.

The above example can be extended to arbitrary sets $S$ and
binary codes $C \subseteq 2^S$.
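The correspondence used in Example 1 is easy to check mechanically. The sketch below represents binary sequences as strings and subsets as `frozenset` objects; the helper names `to_bits`, `to_subset`, and `hamming` are hypothetical.

```python
from itertools import combinations

S = ["a", "b", "c", "d"]  # ambient set with an implied ordering


def to_bits(subset):
    """Indicator sequence of `subset` with respect to the ordering of S."""
    return "".join("1" if e in subset else "0" for e in S)


def to_subset(bits):
    """Subset of S given by the positions of ones in `bits`."""
    return frozenset(e for e, b in zip(S, bits) if b == "1")


def hamming(x, y):
    """Hamming distance between two equal-length binary strings."""
    return sum(a != b for a, b in zip(x, y))


C = ["1100", "1010", "0110", "0011"]
C_S = [to_subset(x) for x in C]

# The mapping is a bijection and an isometry on this code.
for x, y in combinations(C, 2):
    assert to_bits(to_subset(x)) == x
    assert hamming(x, y) == len(to_subset(x) ^ to_subset(y))

# Minimum distance and codeword cardinality of the subset code C_S.
min_dist = min(len(u ^ v) for u, v in combinations(C_S, 2))
cardinalities = {len(u) for u in C_S}
```

Here `min_dist` and `cardinalities` give the values $d$ and $\ell$ of the code type stated in the example.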

V. Some practical considerations 

To conclude the paper, we give below some comments on 
subset codes and the channel model that could be relevant for 
their analysis in practical scenarios. 

Comments on binary codes: One constraint on the binary
codes corresponding to $[n, k, d; \ell]$ subset codes should be
pointed out. Namely, "practical" subset codes will certainly
require that $\ell \ll 2^n$, i.e., that the number of packets in
one codeword is much less than the number of all possible
packets. This means that binary codes corresponding to
(practically feasible) subset codes will only have small weight
codewords. Moreover, the fact that binary codes corresponding
to $[n, k, d; \ell]$ subset codes have exponential length ($2^n$) places
additional complexity constraints on the code design.
Comments on the channel model: The links in networks
can generally be unreliable. For example, if a large packet
is sent over a wireless link, it is highly probable that it will
be hit by an error, i.e., that at least one of its bits/symbols
will be received incorrectly. Furthermore, this error probability
increases with the packet length $n$. In such a scenario it can
happen (with fairly high probability) that all of the packets
from the sent codeword are erroneous, in which case
$X \cap Y = \emptyset$ and reliable recovery is impossible. Subset codes alone
do not provide good protection from errors in such cases.
One way to solve this problem is to additionally protect each 
packet with its own error correcting or error detecting code. 
This solution is in agreement with current networking practice. 
Namely, since we treat here an end-to-end network model, 
it is assumed implicitly that (subset) coding is done on the 
transport (or even application) layer. In most networks, packets 
on lower layers (e.g., link and physical layer) include some 
error correcting/error detecting codes (such as LDPC codes for 
error correction combined with CRC codes for error detection). 
These codes effectively create a channel that we treat here, 
namely, they keep the link-layer packet error probability at a 
"reasonable" level. 

Packet insertions also deserve a comment regarding possible
practical applications of subset codes. In general, by inserting
enough packets an adversary can always prevent the receiver
from correctly decoding the received set. Thus we also assume
in our model that the number of insertions is relatively small,
or at least that it behaves as a random variable whose parameters
we can estimate, so that the code can be designed with respect
to these estimated channel statistics. This may not be the case
in practice because insertions inherently represent deliberate
interference, but our assumption can certainly be achieved by
a proper authentication protocol; that way the receiver will
recognize and disregard (most of) the inserted packets. That
is to say that subset codes do not provide any cryptographic
protection; insertions are treated here because they naturally
fit in the model, along with erasures and errors.

We note that the above comments on errors and insertions 
are also valid for subspace codes in network coded networks. 